HW 5

Question 1

Explain the following terms:

  1. Hypercube network: A network topology with $2^d$ nodes in which each node is connected to exactly $d$ other nodes. Each node is assigned a unique $d$-bit binary address, which determines its position in the hypercube, and the distance between any two nodes is the number of bits in which their addresses differ (see the sketch after this list).
  2. Store and forward routing: Each intermediate node stores the message or packet before forwarding it to the next node in the network.
  3. Cut through routing: A technique used in computer networking to forward packets without first receiving the entire packet. The network device (such as a switch or router) reads only the header of the packet, determines its destination, and forwards it immediately to the next hop in the network.
  4. Minimal routing: A routing strategy in which every message follows a shortest path, i.e., the minimum number of hops between the source and destination nodes, which reduces latency and improves the overall performance of the network.
  5. Flow control digits (flits): The smallest units of data on which flow control is performed in high-speed interconnection networks. A packet is broken into flits; the header flit carries the routing information (such as the source and destination), and the remaining flits follow it through the network.
  6. LogP model: A model of a parallel machine described by four parameters: $L$, the latency of communicating a message through the network; $o$, the overhead a processor spends sending or receiving a message; $g$, the gap, i.e., the minimum interval between consecutive messages on a processor; and $P$, the number of processors.
  7. Dilation in a graph embedding: The maximum number of edges in the host (embedding) graph onto which any single edge of the original graph is mapped.
  8. Congestion in a graph embedding: The maximum number of edges of the original graph that are mapped onto any single edge of the host (embedding) graph.
  9. Expansion in a graph embedding: The ratio of the number of nodes in the host (embedding) graph to the number of nodes in the original graph.
  10. Network model of parallel computers: In each step, every processor computes using data stored locally and then communicates one unit of a message along an incident edge; this compute-communicate cycle repeats.
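
As a small illustration of the hypercube definition above, here is a minimal Python sketch (the function names are ours, chosen for illustration): the neighbors of a node are obtained by flipping one address bit at a time, and the distance between two nodes is the Hamming distance of their addresses.

```python
# Illustrative sketch of hypercube addressing. In a d-dimensional hypercube there
# are 2**d nodes, each node has d neighbors, and the distance between two nodes
# is the Hamming distance of their binary addresses.

def neighbors(node: int, d: int) -> list[int]:
    """Neighbors of `node` in a d-dimensional hypercube: flip each of the d address bits."""
    return [node ^ (1 << bit) for bit in range(d)]

def distance(a: int, b: int) -> int:
    """Hypercube distance = Hamming distance = number of differing address bits."""
    return bin(a ^ b).count("1")

if __name__ == "__main__":
    d = 3                                  # 3-dimensional hypercube: 8 nodes, degree 3
    print(neighbors(0b000, d))             # [1, 2, 4] -> addresses 001, 010, 100
    print(distance(0b000, 0b111))          # 3 hops: all three bits differ
```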

Question 2

Part 1

It takes $t_w \times \frac{m}{k}$ to send one of the $k$ segments of the message across one hop. As soon as the first segment has been sent, we send the second, so it takes $t_w \times \frac{m}{k} \times (k - 1)$ until the last segment starts being sent. The last segment then takes $t_w \times d \times \frac{m}{k}$ to reach $P_{\text{destination}}$.

In total, it takes $t_s + t_w \times \frac{m}{k} \times (k - 1) + t_w \times d \times \frac{m}{k} = t_s + t_w \times \frac{m}{k} \times (k - 1 + d)$.

Part 2

Each of the $k$ segments takes $t_w \times d \times \frac{m}{k}$ to traverse the $d$ hops, giving a total of $t_s + t_w \times d \times \frac{m}{k} \times k = t_s + t_w \times d \times m$.
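
As a quick numeric sanity check of the two expressions above (the parameter names $t_s$, $t_w$, $m$, $k$, $d$ follow the notation used here; the concrete values are made up):

```python
# Numeric check of the Question 2 formulas (illustrative values only).

def pipelined_time(t_s, t_w, m, k, d):
    """Part 1: segments are pipelined, so only the first segment pays the full d hops."""
    return t_s + t_w * (m / k) * (k - 1 + d)

def unpipelined_time(t_s, t_w, m, k, d):
    """Part 2: each of the k segments traverses all d hops, so k cancels out."""
    return t_s + t_w * d * (m / k) * k          # = t_s + t_w * d * m

t_s, t_w, m, k, d = 10.0, 1.0, 64.0, 8, 4       # made-up example values
print(pipelined_time(t_s, t_w, m, k, d))        # 10 + 1*8*(8-1+4) = 98.0
print(unpipelined_time(t_s, t_w, m, k, d))      # 10 + 1*4*64      = 266.0
```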

Question 3

  1. The overall latency for sending one message is $8 + 2 + 2 = 12$.

  2. The overall latency for sending $n$ messages is $8 + 2 + 2n = 10 + 2n$.

  3. Broadcast procedure:

    • At time $0$, $P_0$ transmits to $P_1$.
    • At time $2$, the message sent by $P_0$ to $P_1$ enters the network; $P_0$ transmits to $P_2$.
    • At time $4$, the message sent by $P_0$ to $P_2$ enters the network; $P_0$ transmits to $P_3$.
    • At time $6$, the message sent by $P_0$ to $P_3$ enters the network.
    • At time $10$, the message arrives at $P_1$.
    • At time $12$, the message is processed at $P_1$; the message arrives at $P_2$.
    • At time $14$, the message is processed at $P_2$; the message arrives at $P_3$.
    • At time $16$, the message is processed at $P_3$.

    The overall latency is $16$ (reproduced in the sketch below).
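
The timeline above can be reproduced with a short sketch, assuming, as in the walkthrough, a send overhead of 2 per message, a network latency of 8, and a receive-processing cost of 2 (the function name is ours):

```python
# Reproduce the broadcast timeline: P0 sends to P1, P2, ..., injecting one message
# every `send` time units; each message then spends `net` in the network and
# `recv` being processed at its destination.

def broadcast_latency(num_destinations, send=2, net=8, recv=2):
    done = []
    for i in range(num_destinations):
        enters_network = (i + 1) * send   # message i is injected after i+1 send slots
        arrives = enters_network + net    # network traversal
        done.append(arrives + recv)       # processing at the destination
    return max(done)

print(broadcast_latency(3))               # P1..P3 as above -> 16
```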

Question 4

  1. Each processor communicates $\sqrt{p} \times (\frac{n}{\sqrt{p}})^2$ elements, for a total of $p \times \sqrt{p} \times (\frac{n}{\sqrt{p}})^2 = O(n^2 \sqrt{p})$.

  2. The total (parallel) communication time is $\sqrt{p} \times 2(\frac{n}{\sqrt{p}})^2 = 2(\frac{n^2}{\sqrt{p}})$, because the $p$ processors communicate in parallel. The factor of 2 appears because both matrices $A$ and $B$ are communicated.

  3. The total computation time is $\sqrt{p} \cdot 2(\frac{n}{\sqrt{p}})^3 = 2(\frac{n^3}{p})$. The total execution time is $2(\frac{n^2}{\sqrt{p}}) + 2(\frac{n^3}{p})$ (evaluated numerically in the sketch below).
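
A small sketch that plugs numbers into these expressions, assuming a $\sqrt{p} \times \sqrt{p}$ processor grid with $\frac{n}{\sqrt{p}} \times \frac{n}{\sqrt{p}}$ blocks ($n$ and $p$ below are illustrative values):

```python
# Evaluate the Question 4 expressions for a blocked matrix multiply on a
# sqrt(p) x sqrt(p) processor grid with (n/sqrt(p)) x (n/sqrt(p)) blocks.
from math import isqrt

def times(n, p):
    q = isqrt(p)                          # sqrt(p); assume p is a perfect square
    assert q * q == p and n % q == 0
    block = n // q                        # block dimension n/sqrt(p)
    comm_total = p * q * block**2         # total communication volume, O(n^2 * sqrt(p))
    comm_time  = q * 2 * block**2         # parallel communication time, 2 n^2 / sqrt(p)
    comp_time  = q * 2 * block**3         # parallel computation time,   2 n^3 / p
    return comm_total, comm_time, comp_time

comm_total, comm_time, comp_time = times(n=1024, p=16)
print(comm_total, comm_time, comp_time, comm_time + comp_time)
# 4_194_304  524_288  134_217_728  134_742_016
```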

Question 5

$f : (i, j) \rightarrow i, \quad 0 \leq i < p,\ 0 \leq j < p - i$

$g : ((i, p - 1 - i), (i + 1, p - 2 - i)) \rightarrow (i, i + 1), \quad 0 \leq i < p - 1$

  • Congestion: $p$
  • Dilation: $p - 1$
  • Expansion: $\frac{p}{\frac{p(1+p)}{2}} = \frac{2}{1+p}$
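
The metrics above follow the definitions from Question 1. As a generic illustration of how dilation, congestion, and expansion can be computed for a given node mapping, here is a sketch that routes each guest edge along a shortest path in the host; the function names are ours, and the tiny example graphs at the bottom are hypothetical, not the graphs of this question.

```python
# Generic illustration of dilation, congestion, and expansion for a node mapping
# phi: guest node -> host node, assuming each guest edge is routed along a
# shortest path in the host.
from collections import deque

def shortest_path(host_adj, src, dst):
    """BFS shortest path in the host graph, returned as a list of nodes."""
    prev, frontier = {src: None}, deque([src])
    while frontier:
        u = frontier.popleft()
        if u == dst:
            break
        for v in host_adj[u]:
            if v not in prev:
                prev[v] = u
                frontier.append(v)
    path = [dst]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return path[::-1]

def embedding_metrics(guest_nodes, guest_edges, host_adj, phi):
    dilation, load = 0, {}
    for u, v in guest_edges:
        path = shortest_path(host_adj, phi[u], phi[v])
        dilation = max(dilation, len(path) - 1)
        for e in zip(path, path[1:]):             # host edges used by this route
            key = frozenset(e)
            load[key] = load.get(key, 0) + 1
    congestion = max(load.values(), default=0)
    expansion = len(host_adj) / len(guest_nodes)  # host nodes / guest nodes
    return dilation, congestion, expansion

# Hypothetical example: a 3-node triangle mapped onto a 2-node path host.
host_adj = {0: [1], 1: [0]}
phi = {"a": 0, "b": 0, "c": 1}
print(embedding_metrics(["a", "b", "c"],
                        [("a", "b"), ("b", "c"), ("a", "c")],
                        host_adj, phi))           # -> (1, 2, 0.666...)
```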

Question 6

Part 1

  • Dilation: $1$ (the same path is used for traversing across a row, since there is no wrap-around)
  • Congestion: $n$ (the $n$ nodes of a column are mapped onto a single node)

Part 2

This is symmetric to Part 1:

  • Dilation: $1$ (the same path is used for traversing across a column, since there is no wrap-around)
  • Congestion: $n$ (the $n$ nodes of a row are mapped onto a single node)